Backtracking with the 0-1 Knapsack problem(Depth-First Search)(0-1 배낭 문제-깊이우선탐색)

2011-12-24

Recall that in this problem we have a set of items, each of which has a weight and a profit. The weights and profits are positive integers. A thief plans to carry off stolen items in a knapsack, and the knapsack will break if the total weight of the items placed in it exceeds some positive integer W. The thief’s objective is to determine a set of items that maximizes the total profit under the constraint that the total weight cannot exceed W.

We can solve this problem using a state space tree exactly like the one in the Sum-of-Subsets problem. That is, we go to the left from the root to include the first item, and we go to the right to exclude it. Similarly, we go to the left from a node at level 1 to include the second item, and we go to the right to exclude it, and so on. Each path from the root to a leaf is a candidate solution.

This problem is different from the others discussed in this chapter in that it is an optimization problem. This means that we do not know if a node contains a solution until the search is over. Therefore, we back-track a little differently. If the items included up to a node have a greater total profit than the best solution so far, we change the value of the best solution so far. However, we may still find a better solution at one of the node’s descendants (by stealing more items). Therefore, for optimization problems we always visit a promising node’s children. The following is a general algorithm for backtracking in the case of optimization problems.

0-1 배낭 문제는 각각 무게와 이익(무게와 이익은 양의 정수)을 가진 아이템의 집합이 존재한다. 도둑은 훔친 물건을 배낭에 담아갈 계획이고, 배낭 안에 넣은 아이템의 총 무게가 W를 초과하면 배낭은 찢어진다. 도둑의 목표는 총 무게가 W를 초과할 수 없다는 제약하에 총 이익이 최대가 되는 아이템의 집합을 구하는 것이다.

0-1 배낭 문제도 앞서 봐왔던 문제들과 마찬가지로, 상태공간트리를 구축하여 백트래킹 기법으로 문제를 해결한다. 루트에서 왼쪽으로 가면 첫번째 아이템을 배낭에 넣는 경우이고, 오른쪽으로 가면 첫번째 아이템을 배낭에 넣지 않는 경우이다. 동일한 방법으로 두 번째 아이템을 넣으면 레벨 1의 노드에서 왼쪽으로 가고, 넣지 않으면 오른쪽으로 간다. 이런 방법으로 계속하여 상태공간트리를 구축하면, 루트 노드에서부터 잎 노드까지의 모든 경로는 해답 후보가 된다.

이 문제는 최적화 문제(optimization problem)라는 점에서 앞서 다룬 다른 문제들과 다르다. 최선이라고 여겼던 노드를 선택했다고 해서 실제로 그 마디로부터 최적의 해가 항상 나온다는 보장은 없다. 즉, 검색이 완전히 끝나기 전에는 해답을 알 수가 없으므로 검색하는 과정 동안 항상 그 때까지 찾은 최적의 해를 기억해 두어야 한다.

Example Suppose that n = 4, W = 16, and we have the following:

i	$p_i$	$w_i$	$p_i / w_i$
1	$40	2	$20
2	$30	5	$6
3	$50	10	$5
4	$10	5	$2

[The pruned state space tree produced using backtracking in above Example. Stored at each node from top to bottom are the total profit of the items stolen up to the node, their total weight, and the bound on the total profit that could be obtained by expanding beyond the node. The optimal solution is found at the node shaded in color. Each nonpromising node is marked with a cross.]

상태공간트리에서 X표시는 유망하지 않은(nonpromising) 노드를 뜻하며 이런 노드들은 탐색하지 않는다. 각 노드에는 3가지 값들이 들어가 있는데 제일 위부터 profit, weight, bound 값이다. bound가 98인 경우는 40 + 50 + (16 - 12) $\times$ 10/5로 구해진다.

이해를 돕기 위해 하나의 예를 들자면, (3, 4)의 경우 현재까지 구한 최고 profit은 (3, 3)에서 구한 90이다. 그런데 (3, 4)를 방문함으로써 얻을 수 있는 bound값은 50이기 때문에, 90 > 50 이므로 (bound값이 maxprofit보다 작다) 그 노드를 방문하지 않는다. 즉, (3, 4)는 유망하지 않으므로 가지치기 당한다.

The Backtracking Algorithm for the 0-1 Knapsack Problem

Algorithm Design

We first order the items in nonincreasing order according to the values of $p_i / w_i$, where $w_i$ and $p_i$ are the weight and profit, respectively, of the ith item. Next we greedily grab items, adding their profits to bound and their weights to totweight, until we get to an item that if grabbed would bring totweight above W. We grab the fraction of that item allowed by the remaining weight, and we add the value of that fraction to bound. Suppose the node is at level i, and the node at level k is the one that would bring the sum of the weights above W. Then

$$
\begin{align}
totweight &= weight + \sum_{j=i+1}^{k-1}w_j \\
bound &= \left(profit + \sum_{j=i+1}^{k-1}p_j\right) + (W - totweight) \times \frac{w_k}{p_k}
\end{align}
$$

profit: the sum of the profits of the items included up to the node.
weight: the sum of the weights of those items.
totweight: the sum of the weights could be obtained by expanding beyond that node (W를 초과할 수 없다.)
bound: the upper bound on the profit that could be obtained by expanding beyond that node.

노드가 레벨 i에 있다고 하고, 레벨 k에 있는 노드에서 총 무게가 W를 넘었을때의 계산방법이다. 식이 복잡해보이지만 전혀 그렇지 않다. 인덱스가 k일때는 W를 초과하므로 k-1까지의 합을 구 한 것이고 총 무게에서 k-1번까지의 더한 무게를 뺀 후 남은 무게만큼 이익을 쪼개서 채워넣은 것이다.

If maxprofit is the value of the profit in the best solution found so far, then a node at level i is nonpromising if
$$
bound <= maxprofit
$$

maxprofit은 현재까지 찾은 최고이익 값이다. 만약 노드를 확장했을 때의 bound 값이 maxprofit 보다 작거나 같다면, 즉 상기 부등식이 성립하면 레벨 i에 위치한 노드들은 유망하지 않다.

알고리즘의 단계를 구체적으로 세분화하면 다음과 같다.

Step 1 bound와 totweight를 profit과 weight 값으로 초기화한다.
Step 2 다음으로 탐욕적인 방법으로 아이템을 취한다.
Step 3 이 과정을 totweight이 W를 초과하게 되는 아이템을 잡을 때까지 반복한다.
Step 4 남은 무게만큼 마지막 아이템의 일부분을 취한다.

다음은 전체적인 알고리즘의 흐름이다. 먼저 깊이우선탐색으로 각 노드를 방문하고 아래에 명시된 단계들을 수행한다.

Step 1 해당 노드의 profit과 weight를 계산한다.
Step 2 해당 노드의 bound를 계산한다.
Step 3 weight < W 이고, bound > maxprofit이면, 검색을 계속한다. 그렇지 않다면 백트래킹한다.

Pseudo Code

void knapsack(index i, int profit, int weight)
{   
  if (weight <= W  &&  profit > maxprofit)
  {   
    // best so far    
    maxprofit = profit;   
    numset = i;   
    bestset = include;  
  }

  if  (promising(i))
  {     
    include[i+1] = "YES";  // Include w[i+1]    
    knapsack(i+1, profit+p[i+1], weight+w[i+1]);    
    include[i+1] = "NO";  // Not include w[i+1]     
    knapsack(i+1,  profit,  weight);  
  }
}

bool promising(index i)
{   
  index j, k;    
  int totweight;    
  float bound;  

  if (weight >= W)     
    return false;   
  else
  {     
    j = i + 1;    
    bound = profit;     
    totweight = weight;   
    while ((j <= n) && (totweight + w[j] <= W))
    {
      // Grab as many items as possible.
      totweight = totweight + w[j];         
      bound = bound + p[j];  
      j++   
    }     
    k = j;    
    if (k <= n)
    {
      // Grab fraction of kth item.
      bound = bound + (W–totweight)*p[k]/w[k];
    }
    return  bound > maxprofit;   
  }
}

Source Code

// File: knapsack_dfs.h

namespace algorithms
{
  void knapsack(int i, int profit, int weight);
  // Problem: Let n items be given, where each item has a weight and
  // a profit. The weights and profits are positive integers. Furthermore,
  // let a positive integer W be given. Determine a set of items with
  // maximum total profit, under the constraint that the sum of their
  // weights cannot exceed W.
  // Inputs: Positive integers n and W; arrays w and p, each indexed from
  // 1 to n, and each containing positive integers sorted in nonincreasing
  // order according to the values of p[i]/w[i].
  // Outputs: an array bestset indexed from 1 to n, where the values of
  // bestset[i] is "yes" if the ith item is included in the optimal set
  // and is "no" otherwise; an integer maxprofit that is the maximum
  // profit.

  bool promising(int i, int profit, int weight);
}

// File: knapsack_dfs.cpp
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <ctime>
#include "knapsack_dfs.h"
using namespace std;

// GLOBAL VARIANCES and CONSTANT
const int n = 4;
const int W = 16;
int w[n+1] = {0, 2, 5, 10, 5}; // w[0] is meaningless
int p[n+1] = {0, 40, 30, 50, 10}; // p[0] is meaningless
string* include = new string[n+1];
string* bestset = new string[n+1];
int prm_node_cnt = 0;
int nprm_node_cnt = 0;
int numbest = 0;
int maxprofit = 0;

namespace algorithms
{
  void knapsack(int i, int profit, int weight)
  {
    // This set is best so far. Set numbest to number of items considered.
    // Set bestset to this solution.
    if (weight <= W && profit > maxprofit)
    {
      maxprofit = profit;
      numbest = i;
      copy(include, include + n+1, bestset);
    }

    if (promising(i, profit, weight))
    {
      ++prm_node_cnt; // Count the number of promising node
      include[i+1] = "YES"; // Inlcude w[i+1]
      knapsack(i + 1, profit + p[i+1], weight + w[i+1]);
      include[i+1] = "NO";  // Do not include w[i+1]
      knapsack(i + 1, profit, weight);
    }
    else
    {
      ++nprm_node_cnt; // Count the number of non-promising node
    }
  }

  bool promising(int i, int profit, int weight)
  {
    int j, k;
    int totweight;
    float bound;

    // Node is promising only if we should expand to its children.
    // There must be some capacity left for the children.
    if (weight >= W)
    {
      return false;
    }
    else
    {
      j = i + 1;
      bound = (float)profit;
      totweight = weight;

      while (j <= n && totweight + w[j] <= W)
      {
        // Grab as many items as possible.
        totweight = totweight + w[j];
        bound = bound + p[j];
        ++j;
      }

      k = j; // Use k for consistency with formula in text.
      if (k <= n) // Grab fraction of kth item.
      bound = bound + (W - totweight) * (p[k] / w[k]);

      return (bound > maxprofit);
    }
  }
}

int main( )
{
  clock_t start, end;

  start = clock( );
  algorithms::knapsack(0, 0, 0);
  end = clock( );

  for (int j = 1; j <= numbest; ++j)
    cout << "w"<< j << ": " << bestset[j] << setw(3);

  cout << endl << "The answer: $" << maxprofit << endl;
  cout << "The num of promising node: " << prm_node_cnt << endl;
  cout << "The num of nonpromising node: " << nprm_node_cnt << endl;
  double result = (double)(end-start) / CLOCKS_PER_SEC;
  cout << "Elpased time is: "<< result << " sec." << endl;

  delete [ ] include;
  delete [ ] bestset;

  return EXIT_SUCCESS;
}

Time Complexity Analysis

Worst-Case Time Complexity

$O(n) = 1 + 2 + 2^2 + 2^3 + … + 2^n = 2^{(n+1)} - 1 \in \Theta(2^n)$

$w_i$를 포함시키느냐 그렇지 않느냐의 두 가지 선택이므로 상태공간트리 내 전체 노드의 수는 최대 $2^{n+1} - 1$개가 된다. 최악의 경우 방문하는 노드의 수가 지수이지만, 가지치기를 통해 상태노드의 수를 줄일 수 있기 때문에 백트래킹 알고리즘의 효율은 더 좋을 수 있다.

Comparing the Algorithm Techniques for the 0-1 Knapsack Problem

Problem	Algoritm Technique	Worst-Case Time Complexity
0-1 Knapsack Problem	Dynamic Programming	$O(min(2^n, nW))$
0-1 Knapsack Problem	Backtracking(depth-first search)	$\Theta(2^n)$

Owing to the additional bound of nW , it may appear that the dynamic programming algorithm is superior. However, in backtracking algorithms the worst case gives little insight into how much checking is usually saved by backtracking. With so many considerations, it is difficult to analyze theoretically the relative efficiencies of the two algorithms.

nW 덕분에 동적계획 알고리즘으로 문제를 해결하는게 더 좋은 것처럼 보일 수 있다. 그러나 백트래킹 알고리즘에서 최악의 경우를 따지면 실제 백트래킹으로 방문하는 노드의 수를 얼마나 절약했는지 반영이 안되므로, 두 알고리즘의 상대적인 효율을 이론적으로 분석하기는 어렵다.

Backtracking Exercises

Use the Backtracking algorithm for the 0-1 Knapsack problem to maximize the profit for the following problem instance. Suppose that n = 5, W = 9, and we have the following:

i	$p_i$	$w_i$	$p_i / w_i$
1	$20	2	$10
2	$30	5	$6
3	$35	7	$5
4	$12	3	$4
5	$3	1	$3

w1: YES  w2: NO  w3: YES
The answer: $55
The num of promising nodes: 6
The num of nonpromising nodes: 7
Elpased time is: 0 sec.

Blog Powered by Hexo

AlgorithmsBacktracking